feat: allow bwa-mem to output sam format #11457

Merged
piplus2 merged 3 commits into nf-core:master from piplus2:feat-bwa-mem-output-sam
May 11, 2026

Contributor

@piplus2 piplus2 commented May 1, 2026

PR checklist

Closes #11456

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the module conventions in the contribution docs
  • If necessary, include test data in your PR.
  • Remove all TODO statements.
  • Broadcast software version numbers to topic: versions - See version_topics
  • Follow the naming conventions.
  • Follow the parameters requirements.
  • Follow the input/output options guidelines.
  • Add a resource label
  • Use BioConda and BioContainers if possible to fulfil software requirements.
  • Ensure that the test works with either Docker / Singularity. Conda CI tests can be quite flaky:
    • For modules:
      • nf-core modules test <MODULE> --profile docker
      • nf-core modules test <MODULE> --profile singularity
      • nf-core modules test <MODULE> --profile conda
    • For subworkflows:
      • nf-core subworkflows test <SUBWORKFLOW> --profile docker
      • nf-core subworkflows test <SUBWORKFLOW> --profile singularity
      • nf-core subworkflows test <SUBWORKFLOW> --profile conda

The patch allows BWA/MEM to produce SAM format outputs when task.ext.args2 = '--output-fmt sam'. In that case, we can skip samtools view as bwa mem already produces that format.
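
For illustration, a pipeline could request SAM output with a configuration along these lines. This is only a sketch: the `withName` selector `BWA_MEM` is an assumption about how the process is named in the calling pipeline, and only the `ext.args2` value is taken from the description above.

```groovy
// Hypothetical snippet for a pipeline's module configuration file.
process {
    // Assumed process name; adjust the selector to match the actual pipeline.
    withName: 'BWA_MEM' {
        // With this value the patched module is expected to keep the SAM
        // produced by bwa mem instead of converting it with samtools view.
        ext.args2 = '--output-fmt sam'
    }
}
```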

@piplus2 piplus2 requested a review from maxulysse as a code owner May 1, 2026 09:18
Contributor Author

piplus2 commented May 1, 2026

There's a possible issue with the subworkflow FASTQ_CREATE_UMI_CONSENSUS_FGBIO.
Even with the untouched code, the CI fails:

🚀 nf-test 0.9.5
https://www.nf-test.com
Please cite: https://doi.org/10.1093/gigascience/giaf130
(c) 2021 - 2026 Lukas Forer and Sebastian Schoenherr

Load .nf-test/plugins/nft-anndata/0.4.1/nft-anndata-0.4.1.jar
Load .nf-test/plugins/nft-bam/0.6.1/nft-bam-0.6.1.jar
Load .nf-test/plugins/nft-csv/0.1.0/nft-csv-0.1.0.jar
Load .nf-test/plugins/nft-compress/0.1.0/nft-compress-0.1.0.jar
Load .nf-test/plugins/nft-fastq/0.1.0/nft-fastq-0.1.0.jar
Load .nf-test/plugins/nft-utils/0.0.9/nft-utils-0.0.9.jar
Load .nf-test/plugins/nft-vcf/1.0.7/nft-vcf-1.0.7.jar

Test Subworkflow FASTQ_CREATE_UMI_CONSENSUS_FGBIO

  Test [517809d8] 'single_umi' FAILED (7.818s)

  Assertion failed:

  assert workflow.success
         |        |
         workflow false

  Nextflow stdout:

  ERROR ~ Error executing process > 'FASTQ_CREATE_UMI_CONSENSUS_FGBIO:BWAMEM1_INDEX'

  Caused by:
    Input tuple does not match tuple declaration in process `FASTQ_CREATE_UMI_CONSENSUS_FGBIO:BWAMEM1_INDEX` -- offending value: [[id:genome], /nf-core/test-datasets/modules/data/genomics/homo_sapiens/genome/genome.fasta, /nf-core/test-datasets/modules/data/genomics/homo_sapiens/genome/genome.fasta.fai, /nf-core/test-datasets/modules/data/genomics/homo_sapiens/genome/genome.dict]



  Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`

   -- Check '/mnt/c/Users/pingl/Documents/GitHub/nf-core-modules/.nf-test/tests/517809d8b122e70d816e16bb19a4ce62/meta/nextflow.log' file for details
  Nextflow stderr:



  Test [4c722dd0] 'duplex_umi' FAILED (7.009s)

  Assertion failed:

  assert workflow.success
         |        |
         workflow false

  Nextflow stdout:

  ERROR ~ Error executing process > 'FASTQ_CREATE_UMI_CONSENSUS_FGBIO:BWAMEM1_INDEX'

  Caused by:
    Input tuple does not match tuple declaration in process `FASTQ_CREATE_UMI_CONSENSUS_FGBIO:BWAMEM1_INDEX` -- offending value: [[id:genome], /nf-core/test-datasets/modules/data/genomics/homo_sapiens/genome/genome.fasta, /nf-core/test-datasets/modules/data/genomics/homo_sapiens/genome/genome.fasta.fai, /nf-core/test-datasets/modules/data/genomics/homo_sapiens/genome/genome.dict]



  Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`

   -- Check '/mnt/c/Users/pingl/Documents/GitHub/nf-core-modules/.nf-test/tests/4c722dd06f1e8251582b24ecc780003e/meta/nextflow.log' file for details
  Nextflow stderr:



  Test [f9597a38] 'single_umi - stub' Assertion failed:

assert workflow.success
       |        |
       workflow false

java.lang.RuntimeException: Different Snapshot:
[                                                                                                       [
    {                                                                                                       {
        "consensusbam": [                                                                                       "consensusbam": [
            [                                                                                      |
                {                                                                                  <
                    "id": "test_single",                                                           <
                    "single_end": false                                                            <
                },                                                                                 <
                "test_single_consensus_unmapped.bam:md5,d41d8cd98f00b204e9800998ecf8427e"          <
            ]                                                                                      <
        ],                                                                                                      ],
        "groupbam": [                                                                                           "groupbam": [
            [                                                                                      |
                {                                                                                  <
                    "id": "test_single",                                                           <
                    "single_end": false                                                            <
                },                                                                                 <
                "test_single_umi-grouped.bam:md5,d41d8cd98f00b204e9800998ecf8427e"                 <
            ]                                                                                      <
        ],                                                                                                      ],
        "mappedconsensusbam": [                                                                                 "mappedconsensusbam": [
            [                                                                                      |
                {                                                                                  <
                    "id": "test_single",                                                           <
                    "single_end": false                                                            <
                },                                                                                 <
                "test_single.bam:md5,d41d8cd98f00b204e9800998ecf8427e"                             <
            ]                                                                                      <
        ],                                                                                                      ],
        "ubam": [                                                                                               "ubam": [
            [                                                                                      |
                {                                                                                  <
                    "id": "test_single",                                                           <
                    "single_end": false                                                            <
                },                                                                                 <
                "test_single_unaligned.bam:md5,d41d8cd98f00b204e9800998ecf8427e"                   <
            ]                                                                                      <
        ]                                                                                                       ]
    }                                                                                                       }
]                                                                                                       ]

FAILED (6.853s)

  Assertion failed:

  2 of 2 assertions failed

  Nextflow stdout:

  ERROR ~ Error executing process > 'FASTQ_CREATE_UMI_CONSENSUS_FGBIO:BWAMEM1_INDEX'

  Caused by:
    Input tuple does not match tuple declaration in process `FASTQ_CREATE_UMI_CONSENSUS_FGBIO:BWAMEM1_INDEX` -- offending value: [[id:genome], /nf-core/test-datasets/modules/data/genomics/homo_sapiens/genome/genome.fasta, /nf-core/test-datasets/modules/data/genomics/homo_sapiens/genome/genome.fasta.fai, /nf-core/test-datasets/modules/data/genomics/homo_sapiens/genome/genome.dict]



  Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`

   -- Check '/mnt/c/Users/pingl/Documents/GitHub/nf-core-modules/.nf-test/tests/f9597a38552d027b8d67056807e8428/meta/nextflow.log' file for details
  Nextflow stderr:



  Snapshots:
    Obsolete snapshots can only be checked if all tests of a file are executed successful.


FAILURE: Executed 3 tests in 22.66s (3 failed)

Contributor Author

piplus2 commented May 4, 2026

I think the CI fails because of this bug: #11519

Contributor

famosab commented May 6, 2026

We usually want to have compressed output so having sam is not the normal output we would expect. Why do you want this to directly output sam?

Contributor Author

piplus2 commented May 6, 2026

The current code is inconsistent: when task.ext.args2 = '--output-fmt sam', it never returns the SAM file. This commit makes the code work as expected. If we want to exclude SAM, then it should be forbidden in args2, or at least the user should get some feedback about this behaviour.

Contributor

famosab commented May 6, 2026

That is a fair point! I think then we need to solve the other bug and add a test for this new output file and we should be good to go.

Contributor Author

piplus2 commented May 6, 2026

I'm working on a fix for the subworkflow that makes the CI fail.

@piplus2 piplus2 force-pushed the feat-bwa-mem-output-sam branch from 7404fd2 to ccf7952 Compare May 7, 2026 08:13
Contributor Author

piplus2 commented May 7, 2026

The CI fails because of #11548; waiting for PR #11549 to be merged.

@piplus2 piplus2 force-pushed the feat-bwa-mem-output-sam branch from ccf7952 to e8e3e1b Compare May 7, 2026 10:14
Contributor Author

piplus2 commented May 7, 2026

I'm updating the snapshot to match the modified test, which expects the sam output too now.

Contributor

Can you add a test where you actually expect the sam file in the output please? :)

Contributor Author

I've just added process.out.sam. Do you think it would be worth writing a specific test for the SAM output only? I did not include one, to match how cram, csi and crai are handled.

Contributor

In theory we should test each output properly, so yes, please add it :) The way it's done for now, we can already see that the SAM file is not created when it should not be created (which is good).

Contributor Author

Yes, I'll do it :)

Contributor Author

I've added 2 tests (single-end and paired-end reads) to check that we actually get a SAM output.

@piplus2 piplus2 force-pushed the feat-bwa-mem-output-sam branch 2 times, most recently from 2e4c51d to c8e4f0a Compare May 7, 2026 11:38
Contributor

We want the ext.args to be present in the main.nf.test file. That makes everything more readable in one go. See the module specifications for more info.

Other than that good job 🥳

Contributor Author

Oh sorry, my bad! Fixing it now!

Contributor Author

It all looks fine now.

Contributor

It's not properly changed now; can you check the specs again and update accordingly? :)

Contributor Author

sorry, apparently I kept the old file :( fixing it now

@piplus2 piplus2 requested a review from famosab May 7, 2026 14:54
@piplus2 piplus2 force-pushed the feat-bwa-mem-output-sam branch from 0a900f8 to 17a2c56 Compare May 7, 2026 16:49
@piplus2 piplus2 enabled auto-merge May 7, 2026 16:54
@piplus2 piplus2 self-assigned this May 7, 2026
@piplus2 piplus2 force-pushed the feat-bwa-mem-output-sam branch from 17a2c56 to e04edc0 Compare May 8, 2026 09:00
@piplus2 piplus2 closed this May 11, 2026
auto-merge was automatically disabled May 11, 2026 09:54

Pull request was closed

@piplus2 piplus2 force-pushed the feat-bwa-mem-output-sam branch from e04edc0 to 4e22228 Compare May 11, 2026 09:54
@piplus2 piplus2 reopened this May 11, 2026
Contributor

@famosab famosab left a comment

nice work! 🚀

Contributor Author

piplus2 commented May 11, 2026

Thanks for helping me! 🥳 Re-checking to be sure everything is fine before merging.

@piplus2 piplus2 added this pull request to the merge queue May 11, 2026
Merged via the queue into nf-core:master with commit 2fb127c May 11, 2026
65 checks passed
@piplus2 piplus2 deleted the feat-bwa-mem-output-sam branch May 11, 2026 10:22

Development

Successfully merging this pull request may close these issues.

update module: BWA/MEM

2 participants